Broadcast News Story Segmentation Using Manifold Learning on Latent Topic Distributions
Abstract
We present an efficient approach for broadcast news story segmentation using a manifold learning algorithm on latent topic distributions. The latent topic distribution estimated by Latent Dirichlet Allocation (LDA) is used to represent each text block. We employ Laplacian Eigenmaps (LE) to project the latent topic distributions into low-dimensional semantic representations while preserving the intrinsic local geometric structure. We evaluate two approaches, employing LDA and probabilistic latent semantic analysis (PLSA) distributions, respectively. The effects of different amounts of training data and different numbers of latent topics on the two approaches are studied. Experimental results show that our proposed LDA-based approach can outperform the corresponding PLSA-based approach. The proposed approach achieves the best performance, with an F1-measure of 0.7860.
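The projection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the topic distributions below are hand-written toy values standing in for LDA output, the graph is fully connected with cosine-similarity weights (the abstract does not state the graph construction), and the final boundary pick is only a simple illustration of how the embedding might be used. The names `laplacian_eigenmaps`, `topic_dists`, and `boundary` are hypothetical.

```python
import numpy as np


def laplacian_eigenmaps(X, n_components=1):
    """Embed the rows of X with Laplacian Eigenmaps.

    Assumption: a fully connected graph weighted by cosine similarity.
    Solves L y = lambda D y via the symmetric normalized Laplacian.
    """
    n = X.shape[0]
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    W = (X @ X.T) / (norms * norms.T)   # pairwise cosine similarity
    np.fill_diagonal(W, 0.0)            # no self-loops
    d = W.sum(axis=1)                   # node degrees
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # L_sym = I - D^{-1/2} W D^{-1/2}
    L_sym = np.eye(n) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    vals, vecs = np.linalg.eigh(L_sym)  # eigenvalues in ascending order
    # Map back to generalized eigenvectors and skip the trivial first one.
    return d_inv_sqrt[:, None] * vecs[:, 1:n_components + 1]


# Toy stand-in for per-block LDA topic distributions (each row sums to 1):
# three text blocks dominated by topic 0, then three dominated by topic 1.
topic_dists = np.array([
    [0.80, 0.10, 0.10],
    [0.70, 0.20, 0.10],
    [0.75, 0.15, 0.10],
    [0.10, 0.80, 0.10],
    [0.10, 0.70, 0.20],
    [0.15, 0.75, 0.10],
])

Y = laplacian_eigenmaps(topic_dists, n_components=1)

# Illustrative only: place one boundary where adjacent embeddings differ most.
boundary = int(np.argmax(np.abs(np.diff(Y[:, 0])))) + 1
print("boundary before block index:", boundary)
```

In this toy setup the one-dimensional embedding places the two groups of blocks on opposite sides of zero, so the largest jump between neighbouring blocks marks the story change. In practice the topic distributions would come from a trained LDA or PLSA model, and the graph construction and embedding dimension would need tuning.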
Related resources
Two-stage Story Segmentation and Detection on Broadcast News Using Genetic Algorithm
This paper proposes a two-stage story segmentation and detection approach on Mandarin broadcast news. In the two-stage paradigm, a topic classifier is first constructed to find the topic on the broadcast news within a sliding window and determine the potential story boundaries. Then, the problem for story segmentation is transformed to the determination of a chromosome (number sequence) in a se...
Lexical Story Co-Segmentation of Chinese Broadcast News
We present an unsupervised technique, namely story cosegmentation, to automatically extract the common stories on the same topic within a pair of Chinese broadcast news transcripts. Unlike classical topic tracking that usually relies on previously trained topic models, our method is purely data-driven and is able to simultaneously determine the common stories of the input texts. Specifically, w...
Broadcast News Story Boundary Detection Using Visual, Audio and Text Features
News video story segmentation is vital for video summarization, story linking, and curation. We present a multimodal segmentation algorithm which fuses video, audio and text cues for story boundary detection. We show that broadcast news closed captioning is a rich and readily available source that improves story boundary detection. Furthermore, we propose an empirical distribution-based feature...
Tracking topics in broadcast news data
This paper describes a topic tracking system and its ability to cope with sparse training data for broadcast news tracking. The baseline tracker relies on a unigram topic model. In order to compensate for the very small amount of training data for each topic, document expansion is used in estimating the initial topic model, and unsupervised model adaptation is carried out after processing...
Online Story Segmentation of Multilingual Streaming Broadcast News
We present an online story segmentation approach for Broadcast News (BN) that is built upon and integrated into BBN COTS multilingual Broadcast Monitoring System (BMS). We take a discriminative model-based approach, using Support Vector Machines to segment BN transcriptions into thematically coherent stories within the real-time constraints defined by BMS. We extract lexical, topical and story ...
Publication date: 2013